CLAIMS



1. A method for generating a summary of a plurality of related documents in a
collection comprising:

extracting phrases having focus elements from the plurality of
documents;

performing phrase intersection analysis on the extracted phrases to
generate a phrase intersection table;

performing temporal processing on the phrases in the phrase
intersection table; and

performing sentence generation using the phrases in the phrase
intersection table.
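
By way of a non-limiting illustration, the four steps recited in claim 1 might be chained as in the following minimal sketch. The names `Document`, `extract_phrases`, `build_intersection_table` and `summarize` are hypothetical, and each stage is a deliberately simplified stand-in: extraction is reduced to keyword matching, intersection to exact matching of normalized sentences, and sentence generation to re-emitting the shared phrases.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    doc_id: str
    published: date
    sentences: tuple


def extract_phrases(doc, focus_terms):
    """Toy extraction: keep sentences that mention any focus term."""
    return [s for s in doc.sentences
            if any(term in s.lower() for term in focus_terms)]


def build_intersection_table(docs, focus_terms):
    """Map each normalized phrase to (first publication date, document ids),
    keeping only phrases found in at least two documents."""
    doc_ids = defaultdict(set)
    first_seen = {}
    for doc in sorted(docs, key=lambda d: d.published):
        for phrase in extract_phrases(doc, focus_terms):
            key = phrase.lower().strip(" .")
            doc_ids[key].add(doc.doc_id)
            first_seen.setdefault(key, doc.published)
    return {k: (first_seen[k], ids) for k, ids in doc_ids.items() if len(ids) >= 2}


def summarize(docs, focus_terms):
    table = build_intersection_table(docs, focus_terms)
    # Temporal processing (simplified): order shared phrases by first appearance.
    ordered = sorted(table.items(), key=lambda item: item[1][0])
    # Sentence generation (simplified): re-emit each shared phrase as a sentence.
    return " ".join(key.capitalize() + "." for key, _ in ordered)
```

Only phrases found in two or more documents survive into the table, which is the assumption this sketch makes about what "intersection" means; the claim itself does not fix a threshold.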



2. The method of generating a summary as defined by claim 1, wherein the
phrase intersection analysis comprises:

representing the phrases in tree structures having root nodes and
children nodes;

selecting those tree structures with verb root nodes;

comparing the selected root nodes to the other root nodes to identify
identical nodes;

applying paraphrasing rules to non-identical root nodes to determine if
non-identical nodes are equivalent; and

evaluating the children nodes of those tree structures where the parent
nodes are identical or equivalent.
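
The comparison recited in claim 2 can be illustrated, without limitation, by the following sketch. The `Node` class, the `EQUIVALENT` table and the function names are hypothetical; in particular, the paraphrasing rules are reduced here to a small table of lemmas treated as interchangeable, and only verb-rooted trees are compared with one another.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lemma: str
    pos: str  # part of speech, e.g. "verb" or "noun"
    children: list = field(default_factory=list)


# Hypothetical stand-in for paraphrasing rules: groups of equivalent lemmas.
EQUIVALENT = [{"buy", "purchase"}, {"say", "state"}]


def nodes_match(a, b):
    """True if two nodes are identical or equivalent under the toy rules."""
    if a.lemma == b.lemma:
        return True
    return any(a.lemma in group and b.lemma in group for group in EQUIVALENT)


def shared_children(a, b):
    """Lemmas of a's children that have a matching counterpart among b's children."""
    return [c.lemma for c in a.children
            if any(nodes_match(c, d) for d in b.children)]


def intersect(trees):
    """Compare every pair of verb-rooted trees; where the roots match,
    record which children the two trees have in common."""
    verb_trees = [t for t in trees if t.pos == "verb"]
    results = []
    for i, a in enumerate(verb_trees):
        for b in verb_trees[i + 1:]:
            if nodes_match(a, b):
                results.append((a.lemma, b.lemma, shared_children(a, b)))
    return results
```

For example, trees rooted at "buy" and "purchase" are treated as equivalent at the root, after which only their children are examined for overlap.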

3. The method of claim 2, wherein the tree structure is a DSYNT tree structure. 

4. The method of claim 2, wherein the paraphrasing rules are selected from the
group consisting of ordering of sentence components, main clause versus a relative
clause, different syntactic categories, change in grammatical features, omission of an
empty head, transformation of one part of speech to another, and semantically related
words.

5. The method of claim 1, wherein the temporal processing includes: 
time stamping phrases based on a first occurrence of the phrase in the
collection;

substituting date certain references for ambiguous temporal references; 
ordering the phrases based on the time stamp; and 
inserting a temporal marker if a temporal gap between phrases exceeds 
a threshold value. 
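
The temporal processing of claim 5 might be sketched, purely as an illustration, as follows. The `Phrase` class, the two-day default threshold, the wording of the inserted marker and the small set of relative terms handled are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Phrase:
    text: str
    first_seen: date  # time stamp: date of the earliest document containing the phrase


def resolve_relative(text, reference):
    """Replace a few ambiguous temporal references with explicit dates
    computed relative to the reference date."""
    replacements = {
        "yesterday": reference - timedelta(days=1),
        "today": reference,
        "tomorrow": reference + timedelta(days=1),
    }
    for word, day in replacements.items():
        text = re.sub(rf"\b{word}\b", "on " + day.isoformat(), text)
    return text


def order_with_markers(phrases, gap=timedelta(days=2)):
    """Order phrases by time stamp and prefix a temporal marker whenever
    the gap to the previous phrase exceeds the threshold."""
    out = []
    previous = None
    for p in sorted(phrases, key=lambda p: p.first_seen):
        text = resolve_relative(p.text, p.first_seen)
        if previous is not None and p.first_seen - previous > gap:
            days = (p.first_seen - previous).days
            text = f"{days} days later, {text}"
        out.append(text)
        previous = p.first_seen
    return out
```

Resolving relative terms against each phrase's own time stamp, rather than against the date the summary is produced, keeps a word such as "yesterday" anchored to the document in which it first appeared.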

6. The method of claim 1, further comprising a phrase divergence processing 
operation. 

7. The method of claim 1, wherein the sentence generation includes mapping 
phrases to an input format of a language generation engine and operating the language 
generation engine. 

8. A system for generating a summary of a plurality of related documents in a
collection comprising: 

a storage device for storing the documents in the collection; 
a lexical database; and 

a processing subsystem, the processing subsystem being operatively
coupled to the storage device and the lexical database, the processing subsystem being
programmed to access the documents in the storage device and:

using the lexical database to extract phrases having focus elements 
from the plurality of documents; 

performing phrase intersection analysis on the extracted phrases to
generate a phrase intersection table;

performing temporal processing on the phrases in the phrase 
intersection table; and 
performing sentence generation using the phrases in the phrase
intersection table. 

9. The system for generating a summary as defined by claim 8, wherein the
phrase intersection analysis processing further comprises: 

representing the phrases as data structures having root nodes and 
children nodes; 

selecting those data structures with verb root nodes; 
comparing the selected root nodes to the other root nodes to identify 
identical nodes; 

applying paraphrasing rules to non-identical root nodes to determine if
non-identical nodes are equivalent; and

evaluating the children nodes of those data structures where the parent
nodes are identical or equivalent. 

10. The system of claim 9, wherein the data structure is a DSYNT tree structure.

11. The system of claim 9, wherein the paraphrasing rules are selected from the
group consisting of ordering of sentence components, main clause versus a relative
clause, different syntactic categories, change in grammatical features, omission of an
empty head, transformation of one part of speech to another, and semantically related
words.

12. The system of claim 8, wherein the temporal processing includes: 

time stamping phrases based on a first occurrence of the phrase in the
collection;

substituting date certain references for ambiguous temporal references;

ordering the phrases based on the time stamp; and

inserting a temporal marker if a temporal gap between phrases exceeds 
a threshold value. 
13. The system of claim 8, further comprising a phrase divergence processing
operation. 

14. The system of claim 8, wherein the processing subsystem includes a language
generation engine and wherein sentence generation includes mapping phrases to an
input format of the language generation engine and then operating the language
generation engine.

15. The system of claim 8, wherein the storage device for storing the documents in 
the collection is remotely located from the processing subsystem. 

16. A computer readable media for programming a computer system to perform a
method of generating a summary of a plurality of related documents in a collection
comprising:

extracting phrases having focus elements from the plurality of
documents;

performing phrase intersection analysis on the extracted phrases to
generate a phrase intersection table;

performing temporal processing on the phrases in the phrase 
intersection table; and 

performing sentence generation using the phrases in the phrase 
intersection table. 

17. The computer readable media of claim 16, wherein the phrase intersection
analysis comprises: 

representing the phrases in tree structures having root nodes and 
children nodes; 

selecting those tree structures with verb root nodes; 
comparing the selected root nodes to the other root nodes to identify
identical nodes;

applying paraphrasing rules to non-identical root nodes to determine if
non-identical nodes are equivalent; and



WO 00/49517    PCT/US00/04118

evaluating the children nodes of those tree structures where the parent 
nodes are identical or equivalent. 

18. The computer readable media of claim 17, wherein the tree structure is a
DSYNT tree structure.

19. The computer readable media of claim 17, wherein the paraphrasing rules are
selected from the group consisting of ordering of sentence components, main clause
versus a relative clause, different syntactic categories, change in grammatical features,
omission of an empty head, transformation of one part of speech to another, and
semantically related words.

20. The computer readable media of claim 16, wherein the temporal processing 
includes: 

time stamping phrases based on a first occurrence of the phrase in the
collection;

substituting date certain references for ambiguous temporal references;

ordering the phrases based on the time stamp; and 
inserting a temporal marker if a temporal gap between phrases exceeds 
a threshold value. 

21. The computer readable media of claim 16, further comprising a phrase
divergence processing operation.

22. The computer readable media of claim 16, wherein the sentence generation 
includes mapping phrases to an input format of a language generation engine and 
operating the language generation engine. 



